Abstract


Distant supervision is a widely applied approach
to automatic training of relation extraction
systems and has the advantage that it can generate
large amounts of labelled data with minimal effort.
However, this data may contain errors and
consequently systems trained using distant
supervision tend not to perform as well as those
based on manually labelled data. This work proposes
a novel method for detecting potential false
negative training examples using a knowledge
inference method. Results show that our approach
improves the performance of relation extraction
systems trained using distantly supervised data.


1 Introduction 


Distantly supervised relation extraction relies on
automatically labelled data generated using
information from a knowledge base. A sentence is
annotated as a positive example if it contains a
pair of entities that are related in the knowledge
base. Negative training data is often generated
using a closed world assumption: pairs of entities not
listed in the knowledge base are assumed to be
unrelated and sentences containing them considered
to be negative training examples. However, this
assumption is violated when the knowledge base is
incomplete, which can lead to sentences containing
instances of relations being wrongly annotated as
negative examples.

We propose a method to improve the quality of
distantly supervised data by identifying possible
wrongly annotated negative instances. Our proposed
method includes a version of the Path Ranking
Algorithm (PRA) (Lao and Cohen, 2010; Lao et al.,
2011) which infers relation paths by combining
random walks through a knowledge base. We use this
knowledge inference to detect possible false
negatives (or at least entity pairs closely
connected to a target relation) in automatically
labelled training data and show that their removal
can improve relation extraction performance.


2 Related Work 


Distant supervision is widely used to train relation
extraction systems with Freebase and Wikipedia
commonly being used as knowledge bases, e.g.
(Mintz et al., 2009; Riedel et al., 2010; Krause
et al., 2012; Zhang et al., 2013; Min et al., 2013;
Ritter et al., 2013). The main advantage is its
ability to generate large amounts of training data
automatically. On the other hand, this automatically
labelled data is noisy and usually leads to lower
performance than approaches trained using manually
labelled data. A range of filtering approaches have
been applied to address this problem including
multi-class SVM (Nguyen and Moschitti, 2011) and
Multi-Instance learning methods (Riedel et al., 2010;
Surdeanu et al., 2012). These approaches take into
account the fact that entities might occur in
different relations at the same time and may not
necessarily express the target relation. Other
approaches focus directly on the noise in the data.
For instance Takamatsu et al. (2012) use a generative
model to predict incorrect data while Intxaurrondo
et al. (2013) use a range of heuristics including PMI
to remove noise. Augenstein et al. (2014) apply
techniques to detect highly ambiguous entity pairs
and discard them from their labelled training set.


This work proposes a novel approach to the
problem by applying an inference learning method
to identify potential false negatives in distantly
labelled data. Our method makes use of a modified
version of PRA to learn relation paths from a
knowledge base and uses this information to
identify false negatives.













































3 Data and Methods 


We chose to apply our approach to relation
extraction tasks from the biomedical domain since
this has proved to be an important problem within
these documents (Jensen et al., 2006; Hahn et al.,
2012; Cohen and Hunter, 2013; Roller and Stevenson,
2014). In addition, the first application of distant
supervision was to biomedical journal articles
(Craven and Kumlien, 1999). Furthermore, the most
widely used knowledge source in this domain, the
UMLS Metathesaurus (Bodenreider, 2004), is an
ideal resource to apply inference learning given its
rich structure.

We develop classifiers to identify relations
found in two subsets of UMLS: the National Drug
File-Reference Terminology (ND-FRT) and the
National Cancer Institute Thesaurus (NCI). A corpus
of approximately 1,000,000 publications is
used to create the distantly supervised training
data. The corpus contains abstracts published
between 1990 and 2001 annotated with UMLS concepts
using MetaMap (Aronson and Lang, 2010).


3.1 Distantly labelled data 

Distant supervision is carried out for a target
UMLS relation by identifying instance pairs and
using them to create a set of positive instance
pairs. Any pairs which also occur as an instance
pair of another UMLS relation are removed from
this set. A set of negative instance pairs is then
created by forming new combinations that do not
occur within the positive instance pairs. Sentences
containing a positive or negative instance pair are
then extracted to generate positive and negative
training examples for the relation. These candidate
sentences are then stemmed (Porter, 1997)
and PoS tagged (Charniak and Johnson, 2005).

The sets of positive and negative training examples
are then filtered to remove sentences that meet
any of the following criteria: they contain the same
positive pair more than once; they contain both a
positive and a negative pair; there are more than 5
words between the two elements of the instance pair;
they contain very common instance pairs.
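
The labelling and filtering steps described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names, the dictionary-based knowledge base, the (concept, token index) mention format and the reading of "new combinations" as recombined first and second arguments of positive pairs are all hypothetical.

```python
from itertools import permutations, product


def build_instance_pairs(kb, target):
    """kb maps a relation name to a set of (arg1, arg2) pairs (assumed format)."""
    positives = set(kb[target])
    # Drop pairs that are also an instance of another relation.
    for relation, pairs in kb.items():
        if relation != target:
            positives -= pairs
    # Assumption: negatives recombine the arguments seen in positive pairs.
    firsts = {a for a, _ in positives}
    seconds = {b for _, b in positives}
    negatives = {p for p in product(firsts, seconds) if p not in positives}
    return positives, negatives


def keep_sentence(mentions, positives, negatives,
                  common_pairs=frozenset(), max_gap=5):
    """mentions: list of (concept, token_index) found in one sentence."""
    found = []  # (pair, number of words between the two mentions)
    for (c1, i1), (c2, i2) in permutations(mentions, 2):
        pair = (c1, c2)
        if pair in positives or pair in negatives:
            found.append((pair, abs(i2 - i1) - 1))
    pos = [p for p, _ in found if p in positives]
    neg = [p for p, _ in found if p in negatives]
    if len(pos) != len(set(pos)):      # same positive pair more than once
        return False
    if pos and neg:                    # both a positive and a negative pair
        return False
    if any(gap > max_gap for _, gap in found):
        return False
    if any(p in common_pairs for p, _ in found):
        return False
    return bool(found)


kb = {
    "may_treat": {("aspirin", "headache"), ("ibuprofen", "fever"), ("x", "y")},
    "may_prevent": {("x", "y")},
}
positives, negatives = build_instance_pairs(kb, "may_treat")
```

How a "very common" pair is defined is not stated in the text, so `common_pairs` is left as an explicit input here.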

3.2 PRA-Reduction 


PRA (Lao and Cohen, 2010; Lao et al., 2011)
is an algorithm that infers new relation instances
from knowledge bases. By considering a knowledge
base as a graph, where nodes are connected
through typed relations, it performs random walks
over it and finds bounded-length relation paths that
connect graph nodes. These paths are used as
features in a logistic regression model, which is
meant to predict new relations in the graph. Although
initially conceived as an algorithm to discover
new links in the knowledge base, PRA can
also be used to learn relevant relation paths for
any given relation. For instance, if x and y are
related via sibling relation, the model trained by
PRA would learn that the relation path parent(x,a)
∧ _parent(a,y)¹ is highly relevant, as siblings share
the same parents.
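
To make the notion of a relation path concrete, the sketch below follows the sibling path through a toy graph and computes the probability that a random walk constrained to that path ends at each node. It is only an illustration: the triple format, the function names and the uniform choice among outgoing edges are assumptions, and the logistic regression step that weights the paths is omitted.

```python
from collections import defaultdict


def build_graph(triples):
    """Index (head, relation, tail) triples; '_' marks the inverse relation."""
    graph = defaultdict(set)
    for head, relation, tail in triples:
        graph[(head, relation)].add(tail)
        graph[(tail, "_" + relation)].add(head)
    return graph


def path_distribution(graph, source, path):
    """Probability of reaching each node when walking `path` from `source`.

    At every step the walker picks uniformly among the edges that carry
    the next relation in the path (an assumption of this sketch).
    """
    dist = {source: 1.0}
    for relation in path:
        nxt = defaultdict(float)
        for node, prob in dist.items():
            targets = graph.get((node, relation), ())
            for target in targets:
                nxt[target] += prob / len(targets)
        dist = dict(nxt)
    return dist


# parent(x,a) and parent(y,a): x and y share the parent a.
graph = build_graph([("x", "parent", "a"), ("y", "parent", "a")])
sibling_dist = path_distribution(graph, "x", ["parent", "_parent"])
```

In this toy graph the walk from x reaches y with probability 0.5 (and returns to x with probability 0.5), which is the kind of value that would serve as a path feature for the pair (x, y).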

Knowledge graphs were extracted from the ND-FRT
and NCI vocabularies generating approximately
200,000 related instance pairs for ND-FRT
and 400,000 for NCI. PRA is then run on
both graphs in order to learn paths for each target
relation. Table 1 shows examples of the paths
PRA generated for the relation
biological-process-involves-gene-product together
with their weights. We only make use of relation
paths with positive weights generated by PRA.


path                                                                            weight
gene-encodes-gene-product(x,a) ∧ _gene-plays-role-in-process(a,y)                10.53
_isa(x,a) ∧ biological-process-involves-gene-product(a,y)                         6.17
isa(x,a) ∧ biological-process-involves-gene-product(a,y)                          2.80
gene-encodes-gene-product(x,a) ∧ _gene-plays-role-in-process(a,β) ∧ isa(β,y)     -0.06


Table 1: Example PRA-induced paths and weights
for the NCI relation
biological-process-involves-gene-product.


The paths induced by PRA are used to identify
potential false negatives in the negative training
examples (Section 3.1). Each negative training
example is examined to check whether the entity
pair is related in UMLS by following any of the
relation paths extracted by PRA for the relevant
target relation. Examples containing related entity
pairs are assumed to be false negatives, since
the relation can be inferred from the knowledge
base, and removed from the set of negative training
examples. For instance, using the path in the
top row of Table 1, sentences containing the entities
x and y would be removed if the path
gene-encodes-gene-product(x,a) ∧
_gene-plays-role-in-process(a,y) could be identified
within UMLS.

¹ An underscore prefix represents the inverse of a
relation while ∧ represents path composition.
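
The reduction step can be sketched as a reachability check over the knowledge graph. The code below is an illustration under assumptions rather than the authors' implementation: the graph is a plain dictionary from (node, relation) to a set of neighbours, the relation names `r1` and `r2` and the example records are hypothetical, and only paths with positive weight are used, as stated above.

```python
def follows_path(graph, source, target, path):
    """True if `target` can be reached from `source` by following `path`."""
    frontier = {source}
    for relation in path:
        frontier = {t for node in frontier
                    for t in graph.get((node, relation), ())}
        if not frontier:
            return False
    return target in frontier


def pra_reduce(negative_examples, graph, weighted_paths):
    """Split negatives into (kept, removed) using positive-weight paths."""
    paths = [path for path, weight in weighted_paths if weight > 0]
    kept, removed = [], []
    for example in negative_examples:
        x, y = example["pair"]
        if any(follows_path(graph, x, y, path) for path in paths):
            removed.append(example)   # treated as a potential false negative
        else:
            kept.append(example)
    return kept, removed


# Toy graph: x -r1-> a, and a is linked to y by the inverse of r2.
graph = {("x", "r1"): {"a"}, ("a", "_r2"): {"y"}}
weighted_paths = [(("r1", "_r2"), 10.53), (("r1",), -0.06)]
negatives = [
    {"pair": ("x", "y"), "sentence": "s1"},
    {"pair": ("x", "a"), "sentence": "s2"},
]
kept, removed = pra_reduce(negatives, graph, weighted_paths)
```

Here the pair (x, a) is kept even though it is connected by the path `r1`, because that path has a negative weight and is therefore ignored.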









































3.3 Evaluation 

Relation Extraction system: The MultiR system
(Hoffmann et al., 2010) with features described by
Surdeanu et al. (2011) was used for the experiments.


Datasets: Three datasets were created to train
MultiR and evaluate performance. The first
(Unfiltered) uses the data obtained using distant
supervision (Section 3.1) without removing any
examples identified by PRA. The overall ratio of
positive to negative sentences in this dataset was
1:5.1. However, this changes to 1:2.3 after removing
examples identified by PRA. Consequently the
bias in the distantly supervised data was adjusted
to 1:2 to increase comparability across
configurations. Reducing bias was also found to
increase relation extraction performance, producing
a strong baseline. The PRA-reduced dataset is created
by applying PRA reduction (Section 3.2) to the
Unfiltered dataset to remove a portion of the
negative training examples. Removing these examples
produces a dataset that is smaller than Unfiltered
and with a different bias. Changing the bias of
the training data can influence the classification
results. Consequently the Random-reduced dataset
was created by removing randomly selected negative
examples from Unfiltered to produce a dataset
with the same size and bias as PRA-reduced. The
Random-reduced dataset is used to show that
randomly removing negative instances leads to lower
results than removing those suggested by PRA.
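
A minimal sketch of how the bias adjustment and the Random-reduced control could be produced is given below. The helper names, the fixed seed and the use of uniform sampling without replacement are assumptions made for illustration; the text does not specify how the random selection was performed.

```python
import random


def downsample_negatives(negatives, n_keep, seed=0):
    """Keep `n_keep` randomly chosen negative examples."""
    rng = random.Random(seed)
    return rng.sample(negatives, min(n_keep, len(negatives)))


def rebalance(positives, negatives, neg_per_pos=2.0, seed=0):
    """Downsample negatives to roughly `neg_per_pos` negatives per positive."""
    n_keep = int(round(len(positives) * neg_per_pos))
    return downsample_negatives(negatives, n_keep, seed)


positives = [f"pos{i}" for i in range(10)]
unfiltered_negatives = [f"neg{i}" for i in range(51)]   # roughly 1:5.1

# Adjust the bias of the distantly supervised data to 1:2.
balanced_negatives = rebalance(positives, unfiltered_negatives)

# Random-reduced control: remove as many negatives at random as a
# (hypothetical) PRA reduction removed, so size and bias match.
n_removed_by_pra = 7
random_reduced = downsample_negatives(
    balanced_negatives, len(balanced_negatives) - n_removed_by_pra)
```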

Evaluation: Two approaches were used to evaluate
performance.

The Held-out datasets consist of the Unfiltered, 
PRA-reduced and Random-reduced data sets. The 
set of entity pairs obtained from the knowledge 
base is split into four parts and a process similar 
to 4-fold cross validation applied. In each fold the 
automatically labelled sentences obtained from the 
pairs in 3 of the quarters are used as training data 
and sentences obtained from the remaining quarter 
used for testing. 
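
The held-out procedure splits entity pairs rather than sentences, so that no pair contributes sentences to both training and test data. The following sketch shows one way to do this; the record format (a dictionary with a `pair` key), the shuffling seed and the function names are assumptions.

```python
import random


def pair_folds(pairs, k=4, seed=0):
    """Shuffle the entity pairs and deal them into k folds."""
    pairs = sorted(pairs)
    random.Random(seed).shuffle(pairs)
    return [pairs[i::k] for i in range(k)]


def held_out_splits(sentences, k=4, seed=0):
    """Yield (train, test) sentence lists, one per fold of entity pairs."""
    folds = pair_folds({s["pair"] for s in sentences}, k, seed)
    for fold in folds:
        test_pairs = set(fold)
        train = [s for s in sentences if s["pair"] not in test_pairs]
        test = [s for s in sentences if s["pair"] in test_pairs]
        yield train, test


sentences = [{"pair": (f"e{i % 8}", f"f{i % 8}"), "id": i} for i in range(24)]
splits = list(held_out_splits(sentences))
```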

The Manually labelled dataset contains 400
examples of the relation may-prevent and 400 of
may-treat which were manually labelled by two
annotators who were medical experts. Both relations
are taken from the ND-FRT subset of UMLS.
Each annotator was asked to label every sentence
and then re-examine cases where there was
disagreement. This process led to inter-annotator
agreement of 95.5% for may-treat and 97.3% for
may-prevent. The annotated data set is publicly
available.² Any sentences in the training data
containing an entity pair that occurs within the
manually labelled dataset are removed. Although this
dataset is smaller than the held-out dataset, its
annotations are more reliable and it is therefore
likely to be a more accurate indicator of
performance. This dataset is more balanced than the
held-out data with a ratio of 1:1.3 for may-treat
and 1:1.8 for may-prevent.

Evaluation metric: Our experiments use entity
level evaluation since this is the most appropriate
approach to determine suitability for database
population. Precision and recall are computed
based on the proportion of entity pairs identified.
For the held-out data the set of correct entity pairs
are those which occur in sentences labelled as
positive examples of the relation and which are also
listed as being related in UMLS. For the manually
labelled data it is simply the set of entity pairs that
occur in positive examples of the relation.
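
Entity level scoring compares sets of entity pairs rather than individual sentences. A minimal sketch, assuming pairs are hashable tuples and that the function name is hypothetical:

```python
def entity_level_scores(predicted_pairs, gold_pairs):
    """Precision, recall and F1 over sets of entity pairs."""
    predicted, gold = set(predicted_pairs), set(gold_pairs)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold = {("d1", "c1"), ("d2", "c2"), ("d3", "c3")}
predicted = {("d1", "c1"), ("d2", "c2"), ("d4", "c4")}
precision, recall, f1 = entity_level_scores(predicted, gold)
```

A pair that occurs in several sentences is counted once, which is what distinguishes this from sentence level scoring.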

4 Results 

4.1 Held-out data 

Table 2 shows the results obtained using the held-out
data. Overall results, averaged across all relations
with maximum recall, are shown in the top
portion of the table and indicate that applying PRA
improves performance. Although the highest precision
is obtained using the Unfiltered classifier,
the PRA-reduced classifier leads to the best recall
and F1. Performance of the Random-reduced classifier
indicates that the improvement is not simply
due to a change in the bias in the data but that the
examples it contains lead to an improved model.

The lower part of Table 2 shows results for each
relation. The PRA-reduced classifier produces the
best results for the majority of relations and always
increases recall compared to Unfiltered.

It is perhaps surprising that removing false
negatives from the training data leads to an increase
in recall, rather than precision. False negatives
cause the classifier to generate an overly restrictive
model of the relation and to predict positive
examples of a relation as negative. Removing them
leads to a less constrained model and higher recall.

There are two relations where there is also an
increase in precision (contraindicating-class-of and
mechanism-of-action-of) and these are also the
ones for which the fewest training examples are
² https://sites.google.com/site/umlscorpus/home










                                        Unfiltered             Random-reduced         PRA-reduced
                                        Prec.   Rec.    F1     Prec.   Rec.    F1     Prec.   Rec.    F1
Overall                                 62.30   51.82   56.58  44.49   74.26   55.64  56.85   77.10   65.44

NCI relations
biological_process_involves_gene_product 89.61  43.18   57.86  65.67   78.79   71.38  70.63   84.85   76.97
disease_has_normal_cell_origin          60.20   83.86   69.95  43.2    95.21   58.85  42.80   91.88   57.91
gene_product_has_associated_anatomy     41.65   64.04   49.96  29.22   74.63   41.81  37.94   65.28   47.82
gene_product_has_biochemical_function   86.43   72.00   78.33  60.66   91.57   72.90  70.58   95.80   81.17
process_involves_gene                   78.92   50.71   61.54  51.38   80.64   62.73  68.16   87.34   76.47

ND-FRT relations
contraindicating_class_of               40.00   20.83   26.14  28.48   72.50   39.58  41.30   82.50   54.33
may_prevent                             27.48   14.69   18.87  20.61   44.79   27.94  38.11   35.63   36.64
may_treat                               48.66   39.63   43.14  39.57   50.00   43.84  50.88   57.93   54.11
mechanism_of_action_of                  47.15   40.63   43.12  40.25   59.38   47.62  52.85   59.38   55.82


Table 2: Evaluation using held-out data



               Unfiltered             Random-reduced         PRA-reduced
relation       Prec.   Rec.    F1     Prec.   Rec.    F1     Prec.   Rec.    F1
may_prevent    54.17   21.67   30.95  53.57   25.00   34.09  39.66   38.33   38.98
may_treat      40.00   47.48   43.42  43.21   50.36   46.51  41.05   67.63   51.09


Table 3: Evaluation using manually labelled data



Figure 1: Precision/Recall Curve for Held-out data

available. The classifier has access to such a limited
amount of data for these relations that removing
the false negatives identified by PRA allows it
to learn a more accurate model.

Figure 1 presents a precision/recall curve computed
using MultiR’s output probabilities. Results
for the PRA-reduced and the Random-reduced
classifiers show that reducing the amount of negative
training data increases recall. However, using
PRA-reduced generally leads to higher precision,
indicating that PRA is able to identify suitable
instances for removal from the training set. The
Unfiltered classifier produces good results but
precision and recall are lower than PRA-reduced.
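
A precision/recall curve of this kind can be derived by ranking the predicted entity pairs by classifier probability and scoring every prefix of the ranking. The sketch below assumes a list of (pair, probability) tuples as input; this format and the function name are illustrative and do not reflect MultiR's actual output format.

```python
def precision_recall_curve(scored_pairs, gold_pairs):
    """Return (recall, precision) points, one per rank cut-off."""
    gold = set(gold_pairs)
    if not gold:
        return []
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    curve, true_positives = [], 0
    for rank, (pair, _probability) in enumerate(ranked, start=1):
        if pair in gold:
            true_positives += 1
        curve.append((true_positives / len(gold), true_positives / rank))
    return curve


scored = [("a", 0.9), ("b", 0.8), ("c", 0.4)]
curve = precision_recall_curve(scored, gold_pairs={"a", "c"})
```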

4.2 Manually labelled 

Table 3 shows results of evaluation on the more
reliable manually labelled data set. The best overall
performance is once again obtained using the
PRA-reduced classifier. There is an increase in
recall for both relations and a slight increase in
precision for may-treat. Performance of the
Random-reduced classifier is also better than
Unfiltered, with the overall improvement largely
resulting from increased recall, but remains
below PRA-reduced. These results confirm that
removing examples identified by PRA improves the
quality of training data.

Further analysis indicated that the PRA-reduced
classifier produces the fewest false negatives in its
predictions on the manually annotated dataset. It
incorrectly labels 82 entity pairs (45 may-treat, 37
may-prevent) as negative while Unfiltered predicts
120 (73, 47) and Random-reduced 114 (69, 45).
This supports our initial hypothesis that removing
potential false negatives from training data
improves classifier predictions.

5 Conclusions and Future Work 

This paper proposes a novel approach to identifying
incorrectly labelled instances generated using
distant supervision. Our method applies an inference
learning method to detect and discard possible
false negatives from the training data. We
show that our method improves performance for
a range of relations in the biomedical domain by
making use of information from UMLS.

In future we would like to explore alternative 





























methods for selecting PRA relation paths to identify
false negatives. Furthermore we would like
to examine the PRA-reduced data in more detail.
We would like to find which kind of entity pairs
are detected by our proposed method and whether
the reduced data can also be used to extend the
positive training data. We would also like to apply
the approach to other domains and alternative
knowledge bases. Finally it would be interesting
to compare our approach to other state of the art
relation extraction systems for distant supervision or
biased-SVM approaches such as Liu et al. (2003).


Acknowledgements 

The authors are grateful to the Engineering 
and Physical Sciences Research Council for 
supporting the work described in this paper 
(EP/J008427/1). 


References 

[Aronson and Lang, 2010] A. Aronson and F. Lang.
2010. An overview of MetaMap: historical perspective
and recent advances. Journal of the American
Medical Informatics Association, 17(3):229-236.

[Augenstein et al., 2014] Isabelle Augenstein, Diana
Maynard, and Fabio Ciravegna. 2014. Relation
extraction from the web using distant supervision.
In Proceedings of the 19th International Conference
on Knowledge Engineering and Knowledge
Management (EKAW 2014), Linköping, Sweden,
November.

[Bodenreider, 2004] Olivier Bodenreider. 2004. The
unified medical language system (UMLS): integrating
biomedical terminology. Nucleic Acids Research,
32(suppl 1):D267-D270.

[Charniak and Johnson, 2005] Eugene Charniak and
Mark Johnson. 2005. Coarse-to-fine n-best parsing
and maxent discriminative reranking. In Proceedings
of the 43rd Annual Meeting on Association
for Computational Linguistics, ACL ’05, pages
173-180, Stroudsburg, PA, USA. Association for
Computational Linguistics.

[Cohen and Hunter, 2013] K Bretonnel Cohen and
Lawrence E Hunter. 2013. Text mining for translational
bioinformatics. PLoS Computational Biology,
9(4):e1003044.

[Craven and Kumlien, 1999] Mark Craven and Johan
Kumlien. 1999. Constructing biological knowledge
bases by extracting information from text sources.
In Proceedings of the Seventh International Conference
on Intelligent Systems for Molecular Biology
(ISMB), pages 77-86. AAAI Press.

[Hahn et al., 2012] Udo Hahn, K Bretonnel Cohen, Yael
Garten, and Nigam H Shah. 2012. Mining the
pharmacogenomics literature: a survey of the state
of the art. Briefings in Bioinformatics, 13(4):460-494.

[Hoffmann et al., 2010] Raphael Hoffmann, Congle
Zhang, and Daniel S. Weld. 2010. Learning 5000
relational extractors. In Proceedings of the 48th
Annual Meeting of the Association for Computational
Linguistics, ACL ’10, pages 286-295, Stroudsburg,
PA, USA. Association for Computational
Linguistics.

[Intxaurrondo et al., 2013] Ander Intxaurrondo, Mihai
Surdeanu, Oier Lopez de Lacalle, and Eneko Agirre.
2013. Removing noisy mentions for distant
supervision. Procesamiento del Lenguaje Natural,
51:41-48.

[Jensen et al., 2006] Lars Juhl Jensen, Jasmin Saric, and
Peer Bork. 2006. Literature mining for the biologist:
from information retrieval to biological discovery.
Nature Reviews Genetics, 7(2):119-129.

[Krause et al., 2012] Sebastian Krause, Hong Li, Hans
Uszkoreit, and Feiyu Xu. 2012. Large-scale learning
of relation-extraction rules with distant supervision
from the web. In Proceedings of the 11th
International Conference on The Semantic Web -
Volume Part I, ISWC’12, pages 263-278, Berlin,
Heidelberg. Springer-Verlag.

[Lao and Cohen, 2010] Ni Lao and William W. Cohen.
2010. Relational retrieval using a combination
of path-constrained random walks. Mach. Learn.,
81(1):53-67, October.

[Lao et al., 2011] Ni Lao, Tom Mitchell, and William W.
Cohen. 2011. Random walk inference and learning
in a large scale knowledge base. In Proceedings
of the 2011 Conference on Empirical Methods in
Natural Language Processing, pages 529-539,
Edinburgh, Scotland, UK., July. Association for
Computational Linguistics.

[Liu et al., 2003] Bing Liu, Yang Dai, Xiaoli Li, Wee Sun
Lee, and Philip S. Yu. 2003. Building text classifiers
using positive and unlabeled examples. In Intl.
Conf. on Data Mining, pages 179-188.

[Min et al., 2013] Bonan Min, Ralph Grishman, Li Wan,
Chang Wang, and David Gondek. 2013. Distant
supervision for relation extraction with an incomplete
knowledge base. In Proceedings of the 2013
Conference of the North American Chapter of the
Association for Computational Linguistics: Human
Language Technologies, pages 777-782, Atlanta,
Georgia, June. Association for Computational
Linguistics.

[Mintz et al., 2009] Mike Mintz, Steven Bills, Rion
Snow, and Dan Jurafsky. 2009. Distant supervision
for relation extraction without labeled data. In
Proceedings of the Joint Conference of the 47th
Annual Meeting of the ACL and the 4th International
Joint Conference on Natural Language Processing
of the AFNLP: Volume 2 - Volume 2, ACL ’09, pages
1003-1011, Stroudsburg, PA, USA. Association for
Computational Linguistics.

[Nguyen and Moschitti, 2011] Truc-Vien T. Nguyen and
Alessandro Moschitti. 2011. End-to-end relation
extraction using distant supervision from external
semantic repositories. In Proceedings of the 49th
Annual Meeting of the Association for Computational
Linguistics: Human Language Technologies:
short papers - Volume 2, HLT ’11, pages 277-282,
Stroudsburg, PA, USA. Association for Computational
Linguistics.

[Porter, 1997] M. F. Porter. 1997. Readings in
information retrieval, chapter An Algorithm for Suffix
Stripping, pages 313-316. Morgan Kaufmann Publishers
Inc., San Francisco, CA, USA.

[Riedel et al., 2010] Sebastian Riedel, Limin Yao, and
Andrew McCallum. 2010. Modeling relations and
their mentions without labeled text. In Proceedings
of the European Conference on Machine Learning
and Knowledge Discovery in Databases (ECML
PKDD ’10).

[Ritter et al., 2013] Alan Ritter, Luke Zettlemoyer, Oren
Etzioni, et al. 2013. Modeling missing data in distant
supervision for information extraction. Transactions
of the Association for Computational Linguistics,
1:367-378.

[Roller and Stevenson, 2014] Roland Roller and Mark
Stevenson. 2014. Self-supervised relation extraction
using UMLS. In Proceedings of the Conference
and Labs of the Evaluation Forum 2014, Sheffield,
England.

[Surdeanu et al., 2011] Mihai Surdeanu, David McClosky,
Mason Smith, Andrey Gusev, and Christopher
Manning. 2011. Customizing an information
extraction system to a new domain. In Proceedings
of the ACL 2011 Workshop on Relational Models
of Semantics, pages 2-10, Portland, Oregon, USA,
June. Association for Computational Linguistics.

[Surdeanu et al., 2012] Mihai Surdeanu, Julie Tibshirani,
Ramesh Nallapati, and Christopher D. Manning.
2012. Multi-instance multi-label learning for relation
extraction. In Proceedings of the 2012 Joint
Conference on Empirical Methods in Natural Language
Processing and Computational Natural Language
Learning, EMNLP-CoNLL ’12, pages 455-465,
Stroudsburg, PA, USA. Association for Computational
Linguistics.

[Takamatsu et al., 2012] Shingo Takamatsu, Issei Sato,
and Hiroshi Nakagawa. 2012. Reducing wrong labels
in distant supervision for relation extraction. In
Proceedings of the 50th Annual Meeting of the
Association for Computational Linguistics: Long Papers
- Volume 1, ACL ’12, pages 721-729, Stroudsburg,
PA, USA. Association for Computational Linguistics.

[Zhang et al., 2013] Xingxing Zhang, Jianwen Zhang,
Junyu Zeng, Jun Yan, Zheng Chen, and Zhifang Sui.
2013. Towards accurate distant supervision for
relational facts extraction. In Proceedings of the 51st
Annual Meeting of the Association for Computational
Linguistics (Volume 2: Short Papers), pages
810-815, Sofia, Bulgaria, August. Association for
Computational Linguistics.



